The Influence of Decoys on the Noise and Dynamics of Gene Expression 
Many transcription factors bind to DNA with a remarkable lack of specificity, so that regulatory
binding sites compete with an enormous number of non-regulatory 'decoy' sites. For an
auto-regulated gene, we show that decoy sites decrease noise in the number of unbound proteins to a
Poisson limit that results from binding and unbinding. This noise buffering is optimized for a given
protein concentration when decoys have a 1/2 probability of being occupied. Decoys linearly increase
the time to approach steady state and exponentially increase the time to switch epigenetically
between bistable states.




A transcription factor must bind to a specific site in 
the genome to regulate the expression of a gene. This 
process does not occur in isolation. Instead, actual reg- 
ulatory target sequences must be distinguished from an 
entire genome of alternative possible binding sites. In 
prokaryotes, the typical transcription factor binding mo- 
tif is sufficiently specific that a regulatory target can be 
distinguished from decoys by its binding free energy alone 
as a roughly unique location in the genome [3] . Although 
eukaryotic genomes are much longer, the binding speci- 
ficity of some eukaryotic transcription factor binding mo- 
tifs can be so low that up to tens of thousands of strong 
affinity binding sites can be expected by pure chance [1] . 
Such decoy sites have been identified in repetitive non- 
coding regions [3] . Mutations in these regions have been 
implicated in several diseases, suggesting that the non- 
regulatory binding of transcription factors to DNA could 
serve some currently unknown function, a question that 
is being explored in synthetically engineered systems [5]. 

A ubiquitous regulatory motif involves a single "generic" transcription factor that is
responsible for regulating the expression of many genes [6]. As a result, the functional site
of one gene may also serve as a decoy site for another gene. Additionally, it is known that
active degradation of transcription factors plays an essential role in regulating eukaryotic
gene expression. Under certain conditions binding can sterically inhibit, or even prohibit,
degradation [7], so that several eukaryotic transcription factors are protected from
degradation while bound to DNA [8]. Previously we have shown [5] that when the bound
transcription factors are protected, the mean steady state number of unbound transcription
factors, $\langle n \rangle$, does not change as decoys are added. Instead, the total number of
transcription factors, $N$, adjusts to satisfy the binding to decoys, and thus decoys do not
change the deterministic behavior of the system. In this paper we provide an analytical theory
of how the noise characteristics and approach to steady state of the system are altered by
decoy sites that confer stability through this "asylum" mechanism.
FIG. 1: A. Model of a generic auto-activating gene where transcription factors bind to a
regulatory promoter site (red) as well as $M$ identical, non-regulatory decoy binding sites
(yellow). B. Since they protect bound proteins from degradation, decoy binding sites do not
alter the steady state mean unbound copy number of a unimodal probability distribution,
$\langle n \rangle$, yet they decrease the variance $\sigma_n^2$. C. Similarly, the
deterministic fixed points of a bistable system, $\{\langle a \rangle, \langle b \rangle,
\langle c \rangle\}$, do not change, but when decoys are added the relative stability of the
expression states, LOW ($n < \langle b \rangle$) and HIGH ($n > \langle b \rangle$), is altered.

The Model. To elucidate the general effect of decoys on gene expression we model an
auto-activated gene surrounded by a collection of $M$ identical decoy binding sites that do not
themselves directly regulate transcription but do protect bound proteins from degradation
(Fig. 1). To describe this system we use a master equation (see Appendix A) where each state is
described by three indices: the occupancy of the promoter, $i \in \{\text{unbound } (0),
\text{bound } (1)\}$, the number of bound decoys, $m$, and the number of unbound proteins, $n$.
Solving this master equation numerically allows us to study properties of the steady state
probability distribution over unbound copy numbers, $p_n = \sum_{i,m} p_{i,m,n}(t = \infty)$.
The reactions represented in the master equation include protein production,
$n \xrightarrow{g_i} n + 1$, degradation, $n \xrightarrow{kn} n - 1$, promoter binding,
$(i, n) \xrightarrow{H_p(n)} (i + 1, n - q)$, promoter unbinding,
$(i, n) \xrightarrow{f_p} (i - 1, n + q)$, decoy binding,
$(m, n) \xrightarrow{H_d(n)(M - m)} (m + 1, n - q)$, and decoy unbinding,
$(m, n) \xrightarrow{f_d m} (m - 1, n + q)$. The binding process encoded in the function
$H_x(n)$ is described for $x \in \{p, d\}$ as $H_x(n) = h_x n$ for binding of monomers
($q = 1$) and $H_x(n) = \frac{1}{2} h_x n(n - 1)$ for binding of dimers ($q = 2$). We define a
site equilibrium constant $n_x^\dagger = f_x/h_x$ for $q = 1$ and
$n_x^\dagger = \sqrt{2 f_x/h_x}$ for $q = 2$ that corresponds to a binding free energy $E_x$
such that $n_x^\dagger = e^{\beta E_x}$, where $\beta = (k_B T)^{-1}$. To illustrate the
invariant scalings it is convenient to introduce a factor $S$ so that we write the production
and promoter binding terms as $g_i = g_i' S$ and $n_p^\dagger = n_p'^{\dagger} S$. This results
in $\langle n \rangle = \sum_n n\, p_n \approx \langle n' \rangle S$. The equilibrium
probability that a site is occupied is thus a Hill function,
$\theta_x(\langle n \rangle) = \langle n \rangle^q / [(n_x^\dagger)^q + \langle n \rangle^q]$,
which can also be written in terms of energy, such that
$\theta_x = 1/(1 + \exp[\beta q \Delta E])$, where $\Delta E = E_x - \mu$ and
$\mu = k_B T \ln \langle n \rangle$.
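
As a quick numerical illustration of the two equivalent occupancy expressions above, the following minimal sketch evaluates the Hill form and the energy form side by side. The function names and the choice of units with $k_B T = 1$ are assumptions made here for illustration; they are not part of the original model description.

```python
import math

def theta_hill(n_mean, n_dagger, q=1):
    """Occupancy as a Hill function of the mean unbound copy number."""
    return n_mean**q / (n_dagger**q + n_mean**q)

def theta_energy(n_mean, n_dagger, q=1, kBT=1.0):
    """Same occupancy written with E_x = kBT*ln(n_dagger), mu = kBT*ln(n_mean)."""
    E_x = kBT * math.log(n_dagger)   # binding free energy of the site
    mu = kBT * math.log(n_mean)      # chemical potential of unbound proteins
    delta_E = E_x - mu
    return 1.0 / (1.0 + math.exp(q * delta_E / kBT))

# Illustrative values only (hypothetical, chosen for the demonstration).
half_occupied = theta_hill(50.0, 50.0, q=1)
dimer_hill = theta_hill(50.0, 21.0, q=2)
dimer_energy = theta_energy(50.0, 21.0, q=2)
```

A site with $n_x^\dagger = \langle n \rangle$ (that is, $\Delta E = 0$) is half occupied in both forms.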

We focus on the limiting case where binding and unbinding are both much faster than production
and degradation, the case of so-called "adiabatic" genes. We take advantage of the separation
in timescales to treat separately the fast fluctuations in unbound copy number, due to binding
and unbinding events, from the slow fluctuations in unbound copy number, due to production and
degradation events. The slowly varying component of the unbound copy number, $\bar n(N)$, is
slaved to a constant total copy number, $N = n + q\,i + q\,m$, by assuming binding
equilibrium. $\bar n(N)$ satisfies the equation
$N = \bar n(N) + q\,\theta_p[\bar n(N)] + q\,M\,\theta_d[\bar n(N)]$. In this adiabatic limit
the full master equation for $p_{i,m,n}$ can be reduced to a one dimensional master equation in
terms of the slow variable $N$, $p_N$, by expressing the production and degradation rates as
functions of $\bar n(N)$ (see Appendix B). This reduced master equation allows us to treat the
problem with numerical ease and also find many results analytically.
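
The binding-equilibrium relation above defines $\bar n(N)$ only implicitly. One way to evaluate it, sketched here under the assumption that the Hill form of $\theta_x$ is used for both monomers and dimers, is bisection, since the right-hand side increases monotonically with $\bar n$. The function names and parameter values are hypothetical.

```python
def theta(n, n_dagger, q):
    """Hill-function occupancy of a single site."""
    return n**q / (n_dagger**q + n**q)

def total_copy_number(n, M, n_p, n_d, q):
    """Right-hand side of N = n + q*theta_p(n) + q*M*theta_d(n)."""
    return n + q * theta(n, n_p, q) + q * M * theta(n, n_d, q)

def unbound_from_total(N, M, n_p, n_d, q=1, tol=1e-10):
    """Solve the binding-equilibrium relation for n by bisection on [0, N]."""
    lo, hi = 0.0, float(N)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_copy_number(mid, M, n_p, n_d, q) < N:
            lo = mid          # too few proteins accounted for: raise n
        else:
            hi = mid          # too many: lower n
    return 0.5 * (lo + hi)

# Hypothetical example: 100 proteins in total, 100 decoys with n_d = 50.
n_bar = unbound_from_total(100, M=100, n_p=53.25, n_d=50.0, q=1)
```

Bisection is used here only because it is robust; any one-dimensional root finder would serve equally well.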

Numerical Results. To gain intuition we first numerically solve the master equations for two
cases that are known to have qualitatively different dynamical and noise properties without
decoys: monomer ($q = 1$) and dimer ($q = 2$) binding (see caption of Fig. 2 for details) [17].
Dimer binding allows for bistability and switching between the two attractors, whereas in the
adiabatic limit monomer binding yields a unimodal distribution easily characterized by simple
measures such as the Fano factor for noise ($\sigma_n^2/\langle n \rangle$) and the mean
relaxation time to steady state. In Fig. 2 we see that adding decoys with a fixed binding
energy (we use decoys that are half bound at steady state [18]) quantitatively affects the gene
expression properties. However, when the number of decoys is rescaled by the mean number of
unbound proteins, the results for different choices of $S$ collapse onto a common plot (see
Fig. 2 insets), indicating general principles that we explore below.

FIG. 2: Comparison of numerical (solid curves) and analytical (dashed curves) results for gene
expression properties as decoys are added for systems with varying mean unbound numbers of
protein copies, $\langle n \rangle$. A. The Fano factor; B. The probability for the bistable
system to be in the HIGH protein expression state, $\psi$; C. Time for the mean total copy
number to reach half the steady state value; D. Epigenetic escape time. Numerical results in A
are calculated by projecting the solutions of the 3D master equation for
$p_n = \sum_{i,m} p_{i,m,n}$, whereas the 1D master equation for $p_N$ is accurate for the
results plotted in B, C, D (see Fig. 4 for details). Numerical calculations for the gene
without decoys are used in the analytical calculations. Parameters: $g_1 = 100S$, $g_0 = 8S$,
$k = 1$, $n_p'^{\dagger} = 53.25$ for $q = 1$, which gives $\langle n \rangle = 50S$. For
$q = 2$, $\psi_0 = 0.5$ is fixed such that $n_p^\dagger = 10.3$ for $S = 0.2$,
$n_p^\dagger = 21.0$ for $S = 1$, and $n_p^\dagger = 106.8$ for $S = 2$.

We plot the dependence of the noise and dynamical properties of the system on the binding free
energy of decoys, $E_d$, in Fig. 3. In prokaryotic genomes, there is typically a free energy
penalty of 1 to 2 $k_B T$ per mismatch with respect to the consensus binding motif. When there
are 4 to 5 mismatches the binding becomes characteristic of background DNA [3]. Since most
decoys will have a weaker binding affinity than the promoter, we concentrate on discussing the
large $M$, large $n_d^\dagger$ limit [19].

The steady state unbound Fano factor, $\sigma_n^2/\langle n \rangle$, plotted in Fig. 2A
approaches Poisson noise as decoys are added, such that
$\sigma_n^2 \to \langle n \rangle$ as $M \to \infty$. In the limit of large numbers of decoys
the slow fluctuations in unbound copy number resulting from production and degradation events
are dominated by an effective birth-death process in which a relatively small number of
particles bind and unbind to a large reservoir of sites. We see that systems having smaller
mean numbers of proteins approach the Poisson limit for smaller values of $M$ (compare blue and
green curves in Fig. 2A) than those with larger mean protein numbers. Figure 3A shows that
noise buffering is optimized for a particular value of the decoy binding energy,
$E_d^* = \mu$. This corresponds to the case where decoys are half bound at steady state
($n_d^{\dagger *} = \langle n \rangle$). Intuitively, the potential to buffer noise is
maximized at $E_d^* = \mu$ since binding and unbinding events are most probable when sites are
on average half-occupied.

FIG. 3: Comparison of numerical (solid curves) and analytical (dashed curves) results for the
same properties as in Fig. 2 as a function of the decoy binding energy $E_d$, for fixed numbers
of decoys, $M$. The vertical dashed lines indicate the energies that correspond to the fixed
points of the system. Parameters are the same as in Fig. 2 for $\langle n \rangle = 50$.

Although the mean steady state unbound protein copy number, $\langle n \rangle$, remains
constant, adding decoys increases the mean steady state total protein number,
$\langle N \rangle = \langle n \rangle + M\theta_d(\langle n \rangle)$. The relaxation time,
$\tau_{1/2}$ (the time to reach $\langle N(\tau_{1/2}) \rangle = \langle N \rangle/2$ from an
initial condition of $\langle N(0) \rangle = 0$), increases linearly as decoys are added
(Fig. 2C) due to the time required to produce the proteins needed to satisfy binding
equilibrium. Strongly binding decoys ($E_d \ll \mu$) increase $\tau_{1/2}$ the most because
more proteins must be created (Fig. 3C).

In a bistable system where proteins bind as dimers, the addition of decoys does not alter the
three deterministic fixed points corresponding to the stable low expression, unstable
intermediate expression, and stable high expression levels,
$n = \{\langle a \rangle, \langle b \rangle, \langle c \rangle\}$. However, decoys are able to
influence the ability of the system to stochastically transition between the stable global
phenotypic states, which we call the LOW and HIGH expression states (see Fig. 1C). The binding
affinity of the decoys determines the change in the likelihood of observing the different
expression states. In Fig. 2B we see that decoys with a binding energy
$E_d = \mu_c = k_B T \ln \langle c \rangle$ increase the probability to be in the HIGH protein
copy expression state by preferentially decreasing fluctuations in the protein buffer in the
vicinity of $n = \langle c \rangle$, such that $\psi \to 1$ as $M \to \infty$, where
$\psi = \sum_{n > \langle b \rangle} p_n$. On the other hand, decoys with a binding energy
$E_d = \mu_a = k_B T \ln \langle a \rangle$ will act to stabilize the LOW protein copy number
expression state, such that $\psi \to 0$. We see that the epigenetic escape times, defined as
the mean first passage times between the two steady states,
$\tau_{ac}: n = \langle a \rangle \to n = \langle c \rangle$ and
$\tau_{ca}: n = \langle c \rangle \to n = \langle a \rangle$, increase exponentially as decoys
are added (Fig. 2D). The variation of $\psi$ with decoy binding energy (Fig. 3B) shows that
decoys with binding energy $E_d = \mu_b$ stabilize neither state; however, they significantly
increase the epigenetic escape rate by effectively stabilizing the transition state (Fig. 3D).

Analytical Results. To understand the numerical observations in Figs. 2 and 3 we approximate
the master equation for $p_N$ by a Fokker-Planck equation (see Appendix B). Since the
production and degradation rates depend only on the mean unbound protein concentration,
$\bar n(N)$, the drift and diffusion terms in the Fokker-Planck equation are respectively the
difference and sum of the effective production and degradation rates of a gene without decoys
evaluated at $\bar n(N)$ from the self-consistent relation

$$ N = \bar n(N) + qM\theta_d[\bar n(N)]. \tag{1} $$

To perform the dimensional reduction faithfully it is important to distinguish the unbound
concentration given that the promoter is unbound, $\bar n$, which determines the probability
that the promoter is bound, $\theta_p(\bar n)$, from the net unbound concentration,
$\bar n - q\theta_p(\bar n)$, which determines the protein degradation rate. Although the
reduced model only captures the slow fluctuations in unbound protein concentration, we use it
to understand the numerical results.

The variance in the unbound protein concentration depends on both fast and slow fluctuations
through the law of total variance,
$\sigma_n^2 = \sigma_{n,\mathrm{slow}}^2 + \sigma_{n,\mathrm{fast}}^2$. The variance from the
slow fluctuations, $\sigma_{n,\mathrm{slow}}^2$, can be found by taking the small noise
approximation of the slow variable Fokker-Planck equation. This result can be rewritten in
terms of the mean unbound protein concentration, using a change of variables with the
derivative $J(\bar n) = dN/d\bar n$. In the presence of decoys the drift and diffusion
functions expressed in terms of $\bar n$ are equal to those of a gene without decoys
($M = 0$). Thus one finds $\sigma_{n,\mathrm{slow}}^2 = \sigma_0^2/J(\langle n \rangle)$, where
$\sigma_0^2$ is the variance without decoys (see Appendix B for details).

To calculate the fast contribution to the variance, $\sigma_{n,\mathrm{fast}}^2$, we consider a
master equation for the effective birth-death process alone that comes from binding and
unbinding of the free number of unbound proteins, $n$, to $M$ decoy sites, with a constant
given total number of proteins, $N$ [15]. The steady state solution is found by recursion:
$p_{n|N} = p_{0|N}\,(n_d^\dagger)^n\,\frac{N!\,(M-N)!}{n!\,(N-n)!\,(M-N+n)!} = \exp[\mathcal{F}(n)]$.
In the limit of large $N$, $M$, and $n$, within the Stirling approximation we expand
$\mathcal{F}(n)$ to second order about its fixed point. Setting
$\partial_n[\mathcal{F}(n)] = 0$ recovers Eqn. 1. The variance from this Gaussian expansion of
$p_{n|N}$ gives $\sigma_{n|N}^2 \approx \bar n\,[1 - 1/J(\bar n)]$. The fast contribution to
the unbound fluctuations is now found by averaging over the probability distribution of the
total copy number,
$\sigma_{n,\mathrm{fast}}^2 = \sum_N \sigma_{n|N}^2\, p_N \approx \langle n \rangle [1 - 1/J(\langle n \rangle)]$
(see Appendix C for details).

Combining the fast and slow contributions yields

$$ \sigma_n^2 = \frac{\sigma_0^2 - \langle n \rangle}{J(\langle n \rangle)} + \langle n \rangle. \tag{2} $$



This formula agrees well in the appropriate limits with numerical solutions of the full master
equation, as shown in Figs. 2A and 3A, and also holds for a model that includes translational
bursting (see Appendix D). From Eq. 2 in the large $M$ limit, we obtain the observed Poisson
noise, $\sigma_n^2 \to \langle n \rangle$. Noise reduction is proportional to the deviation
from Poisson noise in a system without decoys. Decoys will decrease noise for
$\sigma_0^2 > \langle n \rangle$ [20]. Eq. 2 is minimized for
$n_d^{\dagger *} = \langle n \rangle$. Eq. 2 can be written as a function of
$M/\langle n \rangle$ and $\Delta E$, which results in the data collapse shown in the inset of
Fig. 2A.

To describe the noise buffering efficacy we quantify the number of decoys needed to reduce the
super-Poissonian noise by a half, $M_{1/2}$. We find
$M_{1/2} = 2\langle n \rangle (1 + \cosh \beta\Delta E)$, which is independent of
$\sigma_0^2$. For decoys with optimum buffering capacities ($\Delta E^* = 0$),
$M_{1/2} = 4\langle n \rangle$, and $M_{1/2}$ asymptotically doubles for every binding energy
increase of $k_B T \ln 2$ (or doubling of $n_d^\dagger$).
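
The noise formula (Eq. 2) and the half-buffering decoy number can be checked against each other numerically. The sketch below assumes monomer binding, for which $J(n) = 1 + M n_d^\dagger/(n_d^\dagger + n)^2$ (Appendix B); the function names and the decoy-free Fano factor of 3 are hypothetical values chosen for illustration, not results from the paper.

```python
import math

def jacobian(n_mean, M, n_d):
    """J = dN/dn for monomer binding to M decoys with equilibrium constant n_d."""
    return 1.0 + M * n_d / (n_d + n_mean) ** 2

def fano_with_decoys(n_mean, fano0, M, n_d):
    """Eq. 2 divided by <n>: (F0 - 1)/J + 1, with F0 the Fano factor at M = 0."""
    return (fano0 - 1.0) / jacobian(n_mean, M, n_d) + 1.0

def m_half(n_mean, beta_delta_E):
    """Decoy number that halves the super-Poissonian part of the noise."""
    return 2.0 * n_mean * (1.0 + math.cosh(beta_delta_E))

n_mean, fano0 = 50.0, 3.0                 # hypothetical gene: <n> = 50, F0 = 3

M_opt = m_half(n_mean, 0.0)               # optimal decoys, n_d = <n>
F_opt = fano_with_decoys(n_mean, fano0, M_opt, n_d=n_mean)

M_weak = m_half(n_mean, math.log(2.0))    # decoys weaker by kT*ln(2), n_d = 2<n>
F_weak = fano_with_decoys(n_mean, fano0, M_weak, n_d=2.0 * n_mean)
```

In both cases the excess over Poisson noise, $F_0 - 1 = 2$, is cut to 1, so the Fano factor drops from 3 to 2, while the weaker decoys need 225 rather than 200 sites to achieve it.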

The relaxation time to reach $\langle N \rangle/2$ copies of proteins when initially there is
no protein present can be obtained directly by integrating the deterministic equation. In the
limit of weak decoys, $E_d > \mu$, we find
$\tau_{1/2} = \tau_{1/2,0} + M\Delta\tau_{1/2}$, where $\Delta\tau_{1/2}$ is a correction due
to decoys, recovering the linear increase of $\tau_{1/2}$ with decoys seen in Fig. 2C. For very
weak decoys, $E_d \gg \mu$ (or $n_d^\dagger \gg \langle n \rangle$),
$J(n) \approx 1 + M/n_d^\dagger = \text{const}$. Hence
$\Delta\tau_{1/2} \approx \tau_{1/2,0}/n_d^\dagger$ (see Appendix E for details).

Within the Fokker-Planck approximation the epigenetic escape time can be found by expanding the
effective potential about the fixed points to second order. In the limit that the barrier
height is sufficiently large one finds
$\tau_{ac} = \tau_{ac,0}\sqrt{J(\langle a \rangle)J(\langle b \rangle)}\,e^{M\zeta_{ab}}$,
where $\tau_{ac,0}$ is the escape time without decoys and $\zeta_{ab}$ is a correction to the
escape path action due to a single decoy. An analogous expression holds for escape from $c$ to
$a$. The escape times increase exponentially for large $M$ as decoys are added.

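Since the model has been reduced to one dimension, the bimodal system obeys an effective
detailed balance such that $\psi\,\tau_{ac} = (1 - \psi)\,\tau_{ca}$, where $\psi$ is the
probability to be in the HIGH protein copy number expression state. Using the previous results
for the escape times,
$\psi/(1 - \psi) = \tau_{ca}/\tau_{ac} = (\tau_{ca,0}/\tau_{ac,0})\sqrt{J(\langle c \rangle)/J(\langle a \rangle)}\,e^{M\zeta_{ac}}$,
with $\zeta_{ac} = \zeta_{cb} - \zeta_{ab}$.
When $n_d^\dagger > \langle b \rangle$, $\zeta_{ac} > 0$ such that $\psi \to 1$ as
$M \to \infty$.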

In summary, when there is a sufficient separation of 
timescales between slow protein production-degradation 
and fast binding-unbinding to the DNA, we have shown 
that decoys buffer gene expression noise. The fluctua- 
tions in binding and unbinding act as an effective birth- 
death process that imposes the Poisson limit on noise re- 
duction. Noise buffering is optimized for decoys that are 
half-occupied at the appropriate protein concentration. 

Not all gene regulatory systems function in the fully 
adiabatic limit explored here [12] . If binding and un- 
binding to decoys is much slower than the fluctuations 
in total copy number, decoys are unable to influence the 
steady state unbound protein expression. If binding and 
unbinding to the promoter become much slower than the 
fluctuations in total copy number, there are effectively 
two gene states with constant production rates. In this 
case the decoys have no impact on the steady state un- 
bound protein expression. 
APPENDIX A: THE MASTER EQUATION

We consider the full master equation for the time evolution of the joint probability
distribution of the promoter occupancy ($i \in \{\text{unbound } (0), \text{bound } (1)\}$),
the number of occupied decoy sites ($m \in \{0, 1, 2, \ldots, M\}$) and the number of unbound
transcription factors ($n \in \{0, 1, 2, \ldots, n_{\max}\}$):

$$
\begin{aligned}
\partial_t p_{i,m,n} ={}& g_i\, p_{i,m,n-1} - g_i\, p_{i,m,n} \\
&+ k(n+1)\, p_{i,m,n+1} - k n\, p_{i,m,n} \\
&+ (-1)^{1-i} H_p(n + qi)\, p_{0,m,n+qi} + (-1)^{i} f_p\, p_{1,m,n-q(1-i)} \\
&+ H_d(n+q)\,[M - (m-1)]\, p_{i,m-1,n+q} - H_d(n)(M - m)\, p_{i,m,n} \\
&+ f_d (m+1)\, p_{i,m+1,n-q} - f_d m\, p_{i,m,n}.
\end{aligned} \tag{A1}
$$

The reactions (and their rates) that change the state of the system are production ($g_i$) and
degradation ($kn$) of transcription factors, binding ($H_p(n)(1 - i)$) and unbinding ($f_p i$)
to the promoter, and binding ($H_d(n)(M - m)$) and unbinding ($f_d m$) to decoys. For
$x \in \{p, d\}$, transcription factors can either bind as monomers ($q = 1$), where
$H_x(n) = h_x n$, or as dimers ($q = 2$), where $H_x(n) = \frac{1}{2} h_x n(n - 1)$. Eq. A1 is
solved numerically by matrix diagonalization with $n_{\max} \gg \langle n \rangle$.
APPENDIX B: SLOW FLUCTUATIONS IN UNBOUND COPY NUMBER

Dimensional reduction. In the limit that binding and unbinding are much faster than production
and degradation of transcription factors, the occupancy of the binding sites reaches its steady
state on a timescale that is faster than the fluctuations in the total copy number of
transcription factors ($N = n + qi + qm$), which result from production and degradation events.
We collapse the master equation (Eqn. A1) to a single dimension that accounts for effective
production, $G(N)$, and degradation, $K(N)$, of proteins:

$$ \partial_t p_N = G(N-1)\,p_{N-1} - G(N)\,p_N + K(N+1)\,p_{N+1} - K(N)\,p_N. \tag{B1} $$

These rates are defined self-consistently as functions of the slowly varying component of the
unbound transcription factors, $\bar n$, which depends on the total number of transcription
factors, $N$:

$$ N = \bar n(N) + qM\theta_d[\bar n(N)], \tag{B2} $$

where the probability that a site is bound is given by

$$
\theta_x[\bar n(N)] =
\begin{cases}
\dfrac{\bar n(N)}{n_x^\dagger + \bar n(N)}, & \text{if } q = 1, \\[2ex]
\dfrac{\bar n(N)[\bar n(N) - 1]}{n_x^\dagger(n_x^\dagger - 1) + \bar n(N)[\bar n(N) - 1]}, & \text{if } q = 2.
\end{cases} \tag{B3}
$$
FIG. 4: Validation of Dimension Reduction. Here we compare calculations from the full master
equation, Eqn. A1 (colored curves), with calculations from the one dimensional master equation,
Eqn. B1 (black dashed lines). We see that the dimension reduction breaks down for small system
size, $\langle n \rangle$, or strong decoys, $E_d \ll \mu$. Parameters: $g_1 = 100 \cdot S$,
$g_0 = 8 \cdot S$, and in panels A and B, $S = 1$, $n_p^\dagger = 53.2$, $n_d^\dagger = 10$; in
panels C, D, E, and F, $n_p^\dagger = 10.3$ for $S = 0.2$, $n_p^\dagger = 21.0$ for $S = 1$,
and $n_p^\dagger = 106.8$ for $S = 2$, with $n_d^\dagger = \langle c \rangle$.



Equation B2 does not include a term corresponding to binding to the promoter because the
probability that the promoter is bound depends on the mean number of transcription factors
given that the promoter is unbound. The production rate is a function of the probability that
the promoter is bound:

$$ G(N) = g_0\,(1 - \theta_p[\bar n(N)]) + g_1\,\theta_p[\bar n(N)]. \tag{B4} $$

The degradation rate is proportional to the net unbound copy number, which includes the mean
number of proteins bound to the promoter:

$$ K(N) = k\,\big[\bar n(N) - q\,\theta_p[\bar n(N)]\big]. \tag{B5} $$

The Fokker-Planck Approximation. We approximate Eqn. B1 with a one dimensional Fokker-Planck
equation:

$$ \frac{\partial}{\partial t} p_N = -\frac{\partial}{\partial N}\left[ v(N) - \frac{1}{2}\frac{\partial}{\partial N} D(N) \right] p_N, \tag{B6} $$

with the drift, $v(N) = G[\bar n(N)] - K[\bar n(N)]$, and diffusion,
$D(N) = G[\bar n(N)] + K[\bar n(N)]$.
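
The reduced one dimensional master equation (B1) is a birth-death chain, so its steady state follows from the recursion $p_{N+1} = p_N\,G(N)/K(N+1)$. The sketch below does this for monomer binding, where Eq. B2 has a closed-form root. It is an illustration under stated assumptions rather than the authors' numerical procedure: the function names, the truncation `N_max`, and the parameter values (taken loosely from the figure captions with $S = 1$) are choices made here.

```python
import math

def theta(n, n_dagger):
    """Monomer (q = 1) occupancy probability."""
    return n / (n_dagger + n)

def nbar_of_N(N, M, n_d):
    """Closed-form root of N = n + M*n/(n_d + n), i.e. Eq. B2 with q = 1."""
    b = n_d + M - N
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * N * n_d))

def reduced_steady_state(M, n_d, n_p, g0, g1, k, N_max):
    """Steady state p_N of Eq. B1 and the slaved unbound number nbar(N)."""
    nbar = [nbar_of_N(N, M, n_d) for N in range(N_max + 1)]
    G = [g0 * (1.0 - theta(n, n_p)) + g1 * theta(n, n_p) for n in nbar]  # Eq. B4
    K = [k * (n - theta(n, n_p)) for n in nbar]                          # Eq. B5
    log_w = [0.0]
    for N in range(N_max):
        # Detailed balance of the chain: p_{N+1} K(N+1) = p_N G(N).
        log_w.append(log_w[-1] + math.log(G[N]) - math.log(K[N + 1]))
    top = max(log_w)                       # subtract the maximum to avoid overflow
    w = [math.exp(x - top) for x in log_w]
    Z = sum(w)
    return [x / Z for x in w], nbar

def mean_var(p, x):
    """Mean and variance of x under the distribution p."""
    m = sum(pi * xi for pi, xi in zip(p, x))
    v = sum(pi * (xi - m) ** 2 for pi, xi in zip(p, x))
    return m, v

params = dict(n_d=50.0, n_p=53.25, g0=8.0, g1=100.0, k=1.0, N_max=400)
p0, nbar0 = reduced_steady_state(M=0, **params)
p1, nbar1 = reduced_steady_state(M=100, **params)
mean0, var0 = mean_var(p0, nbar0)
mean1, var1 = mean_var(p1, nbar1)
```

Comparing `mean0` with `mean1` and `var0` with `var1` illustrates the central claim of the reduction: decoys leave the mean of the slow unbound component essentially unchanged while narrowing its distribution. Only the slow contribution to the variance is captured here.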

The steady state probability distribution of Eq. B6 is given by:

$$ p(N) = \frac{\mathcal{N}}{D(N)} \exp\left[ \int_0^N dN'\, \frac{2v(N')}{D(N')} \right]. \tag{B7} $$

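We define fixed points in total copy number,
$N = \{\langle A \rangle, \langle B \rangle, \langle C \rangle\}$, that correspond to the fixed
points in unbound copy number,
$n = \{\langle a \rangle, \langle b \rangle, \langle c \rangle\}$. The mean escape time from
$N = \langle A \rangle$ to $N = \langle C \rangle$ is [10]:

$$ \tau_{ac} = 2 \int_{\langle A \rangle}^{\langle C \rangle} dY\, \exp[W(Y)] \int_0^{Y} \frac{dZ}{D(Z)}\, \exp[-W(Z)], \tag{B8} $$

where

$$ W(N) = -\int_0^N dN'\, \frac{2v(N')}{D(N')}. \tag{B9} $$

Small-Noise Approximation. Within a Gaussian approximation around $N = \langle N \rangle$,
Eqn. B7 yields the variance in the total protein copy number:

$$ \sigma_N^2 = \left. \frac{D(N)}{-2\,\partial_N[v(N)]} \right|_{N = \langle N \rangle}. \tag{B10} $$

One can obtain the variance in the slowly varying component of the unbound protein copies,
$\bar n$, by performing a change of variables on Equation B10 from $N$ to $\bar n$. The drift
and diffusion functions evaluated for $\bar n$ are equivalent to those of a gene without decoys
($v_0(\bar n)$ and $D_0(\bar n)$). The derivative, $J(\bar n) = dN/d\bar n$, is calculated from
Eqn. B2:

$$
J(n) =
\begin{cases}
1 + \dfrac{M n_d^\dagger}{(n_d^\dagger + n)^2}, & \text{for } q = 1, \\[2ex]
1 + \dfrac{4M (n_d^\dagger)^2\, n}{\left[(n_d^\dagger)^2 + n^2\right]^2}, & \text{for } q = 2,
\end{cases} \tag{B11}
$$

where for dimers we approximate $n(n - 1) \approx n^2$. After the change of variables,

$$ \sigma_{n,\mathrm{slow}}^2 = \left. \frac{D_0(n)/J(n)}{-2\,\partial_n[v_0(n)]} \right|_{n = \langle n \rangle} = \frac{\sigma_0^2}{J(\langle n \rangle)}, \tag{B12} $$

where $\sigma_0^2$ is the variance of the gene without decoys.

Similarly, within a Gaussian approximation about $N = \langle A \rangle$ and
$N = \langle B \rangle$, Eqn. B8 becomes

$$ \tau_{ac} = \frac{2\pi \sqrt{D(\langle A \rangle) D(\langle B \rangle)}}{D(\langle A \rangle) \sqrt{|\partial_N v(\langle A \rangle)|\,|\partial_N v(\langle B \rangle)|}} \exp\left[ -\int_{\langle A \rangle}^{\langle B \rangle} dN\, \frac{2v(N)}{D(N)} \right]. $$

Performing a change of variables from $N$ to $\bar n$, the escape time becomes

$$ \tau_{ac} = \tau_{ac,0} \sqrt{J(\langle a \rangle) J(\langle b \rangle)}\; e^{M\zeta_{ab}}, \tag{B13} $$

where $\tau_{ac,0}$ is the mean escape time without decoys and $\zeta_{ab}$ is the decoy
perturbation to the action over the interval $[\langle a \rangle, \langle b \rangle]$:

$$ \zeta_{ab} = -2q \int_{\langle a \rangle}^{\langle b \rangle} dn'\, \frac{v_0(n')}{D_0(n')}\, \frac{\partial \theta_d(n')}{\partial n'}. \tag{B14} $$

Likewise we find
$\tau_{ca} = \tau_{ca,0} \sqrt{J(\langle c \rangle) J(\langle b \rangle)}\; e^{M\zeta_{cb}}$.
Equivalently, the same formulas for $\sigma_{n,\mathrm{slow}}^2$ and $\tau_{ac}$ can be
obtained by first performing the change of variables on Eqn. B6, obtaining the effective drift
$\tilde v(n) = v_0(n) J^{-1}(n) + \frac{1}{2} D_0(n) J^{-1}(n)\, \partial_n J^{-1}(n)$ and
effective diffusion $\tilde D(n) = D_0(n) J^{-2}(n)$, followed by the small noise
approximation.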



APPENDIX C: FAST FLUCTUATIONS IN THE UNBOUND COPY NUMBER

To study the fast contribution to the variance in the number of unbound protein copies, due to
binding and unbinding of monomers, we consider a master equation indexed over the number of
unbound transcription factors, $n$, given a constant total number of transcription factors,
$N$. We neglect binding and unbinding to the promoter because we are interested in the limit of
large numbers of decoy sites, $M \to \infty$.

$$
\begin{aligned}
\frac{d p_{n|N}}{dt} ={}& f_d \left[ (N - n + 1)\, p_{n-1|N} - (N - n)\, p_{n|N} \right] \\
&+ h_d \left[ (n + 1)(M - N + n + 1)\, p_{n+1|N} - n (M - N + n)\, p_{n|N} \right].
\end{aligned} \tag{C1}
$$

The steady state probability distribution is found by recursion:

$$ p_{n|N} = p_{0|N} \prod_{\ell = 0}^{n-1} \frac{f_d (N - \ell)}{h_d (\ell + 1)(M - N + \ell + 1)} \tag{C2} $$

$$ \phantom{p_{n|N}} = p_{0|N}\, (n_d^\dagger)^n\, \frac{N!\,(M - N)!}{n!\,(N - n)!\,(M - N + n)!} = \exp[\mathcal{F}(n)] \approx \exp[\mathcal{F}(\bar n)]\, \exp\left[ \frac{(n - \bar n)^2}{2} \left. \frac{\partial^2 \mathcal{F}}{\partial n^2} \right|_{\bar n} \right]. \tag{C3} $$

In the last step we Gaussian expand $\mathcal{F}$ for large $M$, $N$, and $n$ within a Stirling
expansion. Setting $\partial_n[\mathcal{F}(n)] = 0$ recovers the deterministic result for the
mean number of unbound protein copies, $\bar n \approx \sum_n n\, p_{n|N}$ for $n \gg 0$, given
in Eqn. B2. The variance in the number of unbound protein copies is:

$$ \sigma_{n|N}^2 = -\left( \left. \frac{\partial^2 \mathcal{F}}{\partial n^2} \right|_{\bar n} \right)^{-1} = \bar n \left[ 1 + \frac{(n_d^\dagger + \bar n)^2}{M n_d^\dagger} \right]^{-1} = \bar n \left[ 1 - \frac{1}{J(\bar n)} \right]. \tag{C4} $$
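
The Gaussian result (C4) can be compared with the exact conditional distribution obtained from the factorial form of the recursion. The sketch below evaluates the weights through `math.lgamma` to avoid overflow and compares the exact variance with $\bar n\,[1 - 1/J(\bar n)]$. The function names and the example values $N = 1100$, $M = 2000$, $n_d^\dagger = 100$ are assumptions chosen so that $\bar n = 100$ exactly; they are not taken from the paper.

```python
import math

def conditional_distribution(N, M, n_d):
    """Exact p_{n|N} for monomers binding to M decoys at fixed total N."""
    n_min = max(0, N - M)                 # at most M proteins can be bound
    ns = list(range(n_min, N + 1))
    log_w = [n * math.log(n_d)
             - math.lgamma(n + 1)         # ln n!
             - math.lgamma(N - n + 1)     # ln (N - n)!
             - math.lgamma(M - N + n + 1) # ln (M - N + n)!
             for n in ns]
    top = max(log_w)
    w = [math.exp(x - top) for x in log_w]
    Z = sum(w)
    return ns, [x / Z for x in w]

def gaussian_prediction(N, M, n_d):
    """Mean from Eq. B2 (q = 1) and variance nbar*(1 - 1/J) from Eq. C4."""
    b = n_d + M - N
    nbar = 0.5 * (-b + math.sqrt(b * b + 4.0 * N * n_d))
    J = 1.0 + M * n_d / (n_d + nbar) ** 2
    return nbar, nbar * (1.0 - 1.0 / J)

N, M, n_d = 1100, 2000, 100.0             # hypothetical example values
ns, p = conditional_distribution(N, M, n_d)
mean_exact = sum(n * pn for n, pn in zip(ns, p))
var_exact = sum((n - mean_exact) ** 2 * pn for n, pn in zip(ns, p))
mean_gauss, var_gauss = gaussian_prediction(N, M, n_d)
```

For these values $J = 6$, so the Gaussian estimate of the variance is $100 \times 5/6 \approx 83.3$, below the Poisson value of 100, and the exact distribution is expected to agree to within a few percent.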



The fast contribution to the unbound fluctuations is found by averaging over the probability
distribution of the total copy number, $p_N$, which is the steady state solution of Eqn. B1:

$$ \sigma_{n,\mathrm{fast}}^2 = \sum_N \sigma_{n|N}^2\, p_N = \left\langle \bar n \left[ 1 + \frac{(n_d^\dagger + \bar n)^2}{M n_d^\dagger} \right]^{-1} \right\rangle \tag{C5} $$

$$ \phantom{\sigma_{n,\mathrm{fast}}^2} \approx \langle n \rangle \left[ 1 + \frac{(n_d^\dagger + \langle n \rangle)^2}{M n_d^\dagger} \right]^{-1} = \langle n \rangle \left[ 1 - \frac{1}{J(\langle n \rangle)} \right]. \tag{C6} $$

We have approximated the average of the function by the function of the average, which is valid
for $(n_d^\dagger + \langle n \rangle)^2 / (M n_d^\dagger) \gg 1$.

APPENDIX D: TRANSLATIONAL BURST NOISE

Another source of noise in gene expression comes from multiple translation events of a single
mRNA copy, so that proteins are effectively produced in bursts rather than one at a time [13].
Although our model does not include mRNA, we mimic the effects of bursting by specifying that
each production event results in an instantaneous burst of $B$ transcription factors with a
reduced production rate of transcription factors, $g \to g/B$, such that the average unbound
number of transcription factors $\langle n \rangle$ does not change even though the variance
increases. For a constitutively produced gene (where $g_0 = g_1$) the variance without decoys
becomes $\sigma_0^2/\langle n \rangle = (B + 1)/2$ [14]. Decoy binding sites have the opposite
effect on the variance to bursts: they decrease the variance without changing the mean
expression $\langle n \rangle$. The noise buffering formula derived above for
$\sigma_n^2 = \sigma_{n,\mathrm{slow}}^2 + \sigma_{n,\mathrm{fast}}^2$ can be applied to a
constitutively produced bursty gene as follows:

$$
\begin{aligned}
\sigma_n^2 &= \left( \sigma_0^2 - \langle n \rangle \right) J^{-1}(\langle n \rangle) + \langle n \rangle, \\
\frac{\sigma_n^2}{\langle n \rangle} &= \left( \frac{B - 1}{2} \right) J^{-1}(\langle n \rangle) + 1.
\end{aligned} \tag{D1}
$$

There are similar opposing effects between decoys and bursts when one considers the bimodal
probability distribution. Large bursts can eliminate bimodality by decreasing the typical
number of production events needed to reach the transition state from a fixed point [15], such
that the probability of the HIGH state decreases. Adding decoys that stabilize the HIGH state
($n_d^\dagger > \langle b \rangle$) can restore bimodality in a bursty bimodal system.
Similarly, bursts exponentially decrease the time to escape between states [16], whereas decoys
exponentially increase the time to escape between states.



APPENDIX E: APPROACH TO STEADY STATE

The time to reach half of the mean steady state expression, $\tau_{1/2}$, starting from a mean
of zero protein copies is found from the deterministic equation for the mean total copy number,
$\partial_t \langle N(t) \rangle = v(N) = v_0[\bar n(N)]$, to be:

$$ \tau_{1/2} = \int_0^{\langle N \rangle/2} dN\, \frac{1}{v_0[\bar n(N)]}. \tag{E1} $$

Performing a change of variables from $N$ to $\bar n$ yields:

$$ \tau_{1/2} = \int_0^{\bar n(\langle N \rangle/2)} dn\, \frac{J(n)}{v_0(n)}, \tag{E2} $$

where the upper boundary is the mean unbound copy number $\bar n$ such that Eqn. B2 is
evaluated for $N = \langle N \rangle/2$ for binding of monomers.

Limit of weak decoys. For weak decoys ($n_d^\dagger \gg \langle n \rangle$), approximating
$\theta_d(\bar n) \approx n/n_d^\dagger$ in Eqn. B2 results in
$N \approx n(1 + M/n_d^\dagger)$ and $J(n) = dN/dn = 1 + M/n_d^\dagger = \text{const}$. In this
limit the upper boundary becomes $\bar n(\langle N \rangle/2) \approx \langle n \rangle/2$ and
$\tau_{1/2} = \tau_{1/2,0} + M\Delta\tau_{1/2}$, where
$\Delta\tau_{1/2} \approx \tau_{1/2,0}/n_d^\dagger$.

Limit of strong decoys. For strong decoys ($n_d^\dagger \ll \langle n \rangle$), Eqn. B2
becomes $N \approx n + M(1 - n_d^\dagger/n)$, and $J(n) \approx 1 + M n_d^\dagger/n^2$.
Therefore, unlike weak decoys that influence $\tau_{1/2}$ independently of $n$, strong decoys
have the most significant effect of increasing the time to reach the steady state (compared to
the gene with no decoys) when $n$ is small.
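
The weak-decoy scaling can be checked by evaluating the integral in Eq. E2 numerically. To keep the example short, the sketch below assumes a constitutively produced gene with $v_0(n) = g - kn$, for which $\tau_{1/2,0} = \ln 2/k$; this is a simplification made here, not the auto-activated gene studied in the main text. The function names, the midpoint-rule integration, and the parameter values are likewise hypothetical.

```python
import math

def nbar_of_N(N, M, n_d):
    """Closed-form root of N = n + M*n/(n_d + n), i.e. Eq. B2 with q = 1."""
    b = n_d + M - N
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * N * n_d))

def relaxation_time(M, n_d, g=50.0, k=1.0, steps=20000):
    """Midpoint-rule evaluation of Eq. E2 for v0(n) = g - k*n (assumed here)."""
    n_ss = g / k                                  # steady state unbound number
    N_ss = n_ss + M * n_ss / (n_d + n_ss)         # steady state total number
    upper = nbar_of_N(0.5 * N_ss, M, n_d)         # unbound number at N = <N>/2
    h = upper / steps
    total = 0.0
    for j in range(steps):
        n = (j + 0.5) * h
        J = 1.0 + M * n_d / (n_d + n) ** 2        # Eq. B11, q = 1
        total += J / (g - k * n) * h
    return total

n_d_weak = 5000.0                                 # n_d >> <n> = 50
tau_0 = relaxation_time(0, n_d_weak)
tau_1000 = relaxation_time(1000, n_d_weak)
tau_2000 = relaxation_time(2000, n_d_weak)
```

With these values `tau_1000` should lie close to $(1 + M/n_d^\dagger)\,\tau_{1/2,0} = 1.2 \ln 2$, and doubling $M$ should roughly double the increase over `tau_0`, which is the linear dependence described above.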



In the limit of extremely strong decoys, each transcription factor that is produced binds to a
decoy site and remains bound. As a result, until all decoys are saturated, the unbound copy
number will be zero. There will be no transcription factors available to bind to the promoter
and the production will be fixed at the basal production level, $g_0$. After saturation,
however, strong decoys no longer influence the dynamics of the system. Therefore the time to
approach steady state can be broken up into a basal production stage and an isolated gene
stage.

For $M > \langle n \rangle$, half of the steady state number of proteins is reached before the
decoys are saturated, in the regime when transcription factors are produced with a rate $g_0$
per unit time:

$$ \tau_{1/2} = \frac{\langle N \rangle}{2 g_0} = \frac{M + \langle n \rangle}{2 g_0}, \quad \text{for } M > \langle n \rangle \gg n_d^\dagger. \tag{E3} $$